Structuring and extracting knowledge for the support of hypothesis generation in molecular biology

Author: A Gomez-Perez
Andrew P Gibson
B Smith
C Goble
CA Goble
CD Manning
CJ Mungall
DA Moreira
DL Rubin
E Neumann
Edgar Meij
EJ Meij
G Antoniou
I Spasic
IH Witten
J Broekstra
JA Kors
Konstantinos Krommydas
LD Stein
LJ Post
M Ashburner
M Missikoff
M Scott Marshall
M Weeber
MA Inda
Marco Roos
Martijn Schuemie
O Tuason
P Fisher
P Missier
P Romano
Pieter W Adriaans
PJ Verschure
R Hoehndorf
R Jelier
R Stevens
R Witte
S Jupp
S Katrenko
Sophia Katrenko
T Clark
Willem Robert van Hage
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Hypothesis generation in molecular and cellular biology is an empirical process in which knowledge derived from prior experiments is distilled into a comprehensible model. The requirement of automated support is exemplified by the difficulty of considering all relevant facts that are contained in the millions of documents available from PubMed. Semantic Web provides tools for sharing prior knowledge, while information retrieval and information extraction techniques enable its extraction from literature. Their combination makes prior knowledge available for computational analysis and inference. While some tools provide complete solutions that limit the control over the modeling and extraction processes, we seek a methodology that supports control by the experimenter over these critical processes.

Results: We describe progress towards automated support for the generation of biomolecular hypotheses. Semantic Web technologies are used to structure and store knowledge, while a workflow extracts knowledge from text. We designed minimal proto-ontologies in OWL for capturing different aspects of a text mining experiment: the biological hypothesis, text and documents, text mining, and workflow provenance. The models fit a methodology that allows focus on the requirements of a single experiment while supporting reuse and posterior analysis of extracted knowledge from multiple experiments. Our workflow is composed of services from the 'Adaptive Information Disclosure Application' (AIDA) toolkit as well as a few others. The output is a semantic model with putative biological relations, with each relation linked to the corresponding evidence.

Conclusion: We demonstrated a 'do-it-yourself' approach for structuring and extracting knowledge in the context of experimental research on biomolecular mechanisms. The methodology can be used to bootstrap the construction of semantically rich biological models using the results of knowledge extraction processes. Models specific to particular experiments can be constructed that, in turn, link with other semantic models, creating a web of knowledge that spans experiments. Mapping mechanisms can link to other knowledge resources such as OBO ontologies or SKOS vocabularies. AIDA Web Services can be used to design personalized knowledge extraction procedures. In our example experiment, we found three proteins (NF-Kappa B, p21, and Bax) potentially playing a role in the interplay between nutrients and epigenetic gene regulation.
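
The abstract describes an output in which every putative relation stays linked to the evidence it was extracted from. A minimal sketch of that idea in plain Python follows. The class names, field names and the `add_evidence` helper are hypothetical illustrations; they are not the authors' OWL proto-ontologies and not part of the AIDA toolkit.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Evidence:
    """One piece of supporting text (all field names are assumptions)."""
    document_id: str    # identifier of the source document
    sentence: str       # text the relation was extracted from
    extracted_by: str   # provenance: which workflow step produced it


@dataclass
class PutativeRelation:
    """A candidate relation plus every piece of evidence found for it."""
    subject: str
    predicate: str
    obj: str
    evidence: list = field(default_factory=list)


def add_evidence(model, subject, predicate, obj, evidence):
    """Attach evidence to a relation, creating the relation on first sight."""
    key = (subject, predicate, obj)
    relation = model.setdefault(key, PutativeRelation(subject, predicate, obj))
    relation.evidence.append(evidence)
    return relation


# Usage with placeholder names: two documents support the same relation.
model = {}
add_evidence(model, "protein A", "associated_with", "process B",
             Evidence("doc-1", "placeholder sentence one", "relation-extraction"))
add_evidence(model, "protein A", "associated_with", "process B",
             Evidence("doc-2", "placeholder sentence two", "relation-extraction"))
```

Keying the model on the (subject, predicate, object) triple means repeated extractions accumulate evidence on one relation instead of creating duplicates, which is what makes later analysis across experiments possible.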

Crossref

VU Research Portal

Springer - Publisher Connector

PubMed Central

EUR Research Repository

UvA-DARE

International Migration, Integration and Social Cohesion online publications

Overview of BioCreative II gene mention recognition.

Nineteen teams presented results for the Gene Mention Task at the BioCreative II Workshop. In this task, participants designed systems to identify substrings in sentences corresponding to gene name mentions. A variety of methods were used, and the results varied, with the highest achieved F1 score being 0.8721. Here we present brief descriptions of all the methods used and a statistical analysis of the results. We also demonstrate that, by combining the results from all submissions, an F score of 0.9066 is feasible, and furthermore that the best result makes use of the lowest scoring submissions.
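
The scores quoted above are F1 values, the harmonic mean of precision and recall. The sketch below shows how such a score is computed from counts of true positives, false positives and false negatives. The function name and the example counts are illustrative assumptions, and the task's own matching rules for gene mentions are not modeled here.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """Return (precision, recall, F1) from raw mention counts."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    # Guard against empty predictions or an empty gold standard.
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up counts, for illustration only.
precision, recall, f1 = f1_score(true_positives=8, false_positives=2, false_negatives=2)
```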

epublications@Marquette

Fraunhofer-ePrints

PubMed Central

Edinburgh Research Explorer

Publications at Bielefeld University

Apollo (Cambridge)

White Rose Research Online

UvA-DARE

International Migration, Integration and Social Cohesion online publications

A local alignment kernel in the context of nlp

Author: Pieter Adriaans
Sophia Katrenko
Publication date: 01/01/2008

This paper discusses local alignment kernels in the context of the relation extraction task. We define a local alignment kernel based on the Smith-Waterman measure as a sequence similarity metric and proceed with a range of possibilities for computing a similarity between elements of sequences. We propose to use distributional similarity measures on elements and by doing so we are able to incorporate extra information from the unlabeled data into a learning task. Our experiments suggest that an LA kernel provides promising results on some biomedical corpora, largely outperforming a baseline.
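
For orientation, the Smith-Waterman measure mentioned above can be sketched as a small dynamic program over two token sequences with a pluggable element similarity. This is only the classic best-local-alignment score, not the paper's kernel. The linear gap penalty, the `exact_match` function and its scores are assumptions made for illustration; the abstract proposes distributional similarity between elements instead.

```python
def smith_waterman_score(seq_a, seq_b, similarity, gap_penalty=1.0):
    """Best local alignment score between two sequences.

    `similarity(x, y)` scores a pair of elements; a linear gap penalty
    is assumed here for simplicity.
    """
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    table = [[0.0] * cols for _ in range(rows)]
    best = 0.0
    for i in range(1, rows):
        for j in range(1, cols):
            align = table[i - 1][j - 1] + similarity(seq_a[i - 1], seq_b[j - 1])
            gap_in_b = table[i - 1][j] - gap_penalty
            gap_in_a = table[i][j - 1] - gap_penalty
            # Scores never drop below zero, so an alignment may start anywhere.
            table[i][j] = max(0.0, align, gap_in_b, gap_in_a)
            best = max(best, table[i][j])
    return best


def exact_match(x, y):
    """Hypothetical element similarity: reward identity, penalize mismatch."""
    return 2.0 if x == y else -1.0


score = smith_waterman_score(["x", "binds", "y", "z"], ["binds", "y"], exact_match)
```

Replacing `exact_match` with a similarity estimated from unlabeled text is the point at which extra information would enter the comparison.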

CiteSeerX

Crossref

UvA-DARE

International Migration, Integration and Social Cohesion online publications

Semantic types of some generic relation arguments: Detection and evaluation

Author: Pieter Adriaans
Sophia Katrenko
Publication date: 01/01/2008

This paper presents an approach to detecting the semantic types of relation arguments employing the WordNet hierarchy. Using the SemEval-2007 data, we show that the method makes it possible to generalize relation arguments with high precision for such generic relations as Origin-Entity, Content-Container, Instrument-Agency and some others.
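
One simple way to picture generalizing relation arguments over a hierarchy is to find the most specific type that all observed arguments share. The sketch below does this over a tiny hand-made hypernym table. The table, the function names and the single-parent assumption are all hypothetical; the paper works with the actual WordNet hierarchy, and its method is not claimed to be this one.

```python
# Toy single-parent hypernym table (an assumption; not real WordNet data).
TOY_HYPERNYMS = {
    "apple": "fruit",
    "pear": "fruit",
    "fruit": "food",
    "bread": "food",
    "food": "entity",
}


def hypernym_path(word, hypernyms):
    """Return the chain from a word up to the root of the hierarchy."""
    path = [word]
    while path[-1] in hypernyms:
        path.append(hypernyms[path[-1]])
    return path


def most_specific_common_type(words, hypernyms):
    """Lowest node in the hierarchy that covers every given word."""
    if not words:
        return None
    paths = [hypernym_path(word, hypernyms) for word in words]
    common = set(paths[0]).intersection(*(set(path) for path in paths[1:]))
    # Walk upward from the first word; the first shared node is the most specific.
    for node in paths[0]:
        if node in common:
            return node
    return None


narrow = most_specific_common_type(["apple", "pear"], TOY_HYPERNYMS)
broad = most_specific_common_type(["apple", "bread"], TOY_HYPERNYMS)
```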

CiteSeerX

UvA-DARE

International Migration, Integration and Social Cohesion online publications

Bootstrapping language associated with biomedical entities

Author: Edgar Meij
Sophia Katrenko
Publication date: 01/01/2007

recognizing topic-specific entities in the returned passages. To address this task, we have designed and implemented a novel data-driven approach by combining information extraction with language modeling techniques. Instead of using an exhaustive list of all possible instances for an entity type, we look at the language usage around each entity type and use that as a classifier to determine whether or not a piece of text discusses such an entity type. We do so by comparing it with language models of the passages. For example, given the entity type “genes”, our approach can measure the gene-iness of a piece of text. Our algorithm works as follows. Given an entity type, it first uses Hearst patterns to extract instances of the type. To extract more instances, we look for new contextual patterns around the instances and use them as input for a bootstrapping method, in which new instances and patterns are discovered iteratively. Afterwards, all discovered instances and patterns are used to find the sentences in the collection which are most on par with the requested entity type. A language model is then generated from these sentences and, at retrieval time, we use this model to rerank retrieved passages. As to the results of our submitted runs, we find that our baseline run performs well above the median of all participants' scores. Additionally, we find that applying our proposed method helps most for those entity types that have unambiguous patterns and numerous instances.
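
The first step of the algorithm described above, extracting instances of an entity type with Hearst patterns, can be sketched with one regular expression. Only the "TYPE such as A, B and C" pattern is covered. The function name, the regular expression and the example sentence are illustrative assumptions rather than the authors' implementation, and the later bootstrapping and language-model reranking steps are not shown.

```python
import re


def hearst_such_as(text, entity_type):
    """Extract instances from phrases like '<entity_type> such as A, B and C'."""
    pattern = re.compile(
        re.escape(entity_type) + r"\s+such\s+as\s+([^.;]+)",
        re.IGNORECASE,
    )
    instances = []
    for match in pattern.finditer(text):
        # Split the enumeration on commas and on the words 'and' / 'or'.
        for item in re.split(r",|\band\b|\bor\b", match.group(1)):
            item = item.strip()
            if item and item not in instances:
                instances.append(item)
    return instances


example = "We studied genes such as BRCA1, TP53 and MYC. Other text follows."
found = hearst_such_as(example, "genes")
```

In a bootstrapping loop, the instances found this way would seed a search for new contextual patterns, which in turn yield further instances.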

CiteSeerX

International Migration, Integration and Social Cohesion online publications

Preface

Author: Dmitry Sustretov
Janneke Huitink
Sophia Katrenko (editors)

This volume contains the papers that will be presented at this year’s Student Session o

CiteSeerX

Finding Constraints for Semantic Relations via Clustering

Author: Pieter Adriaans
Sophia Katrenko
Publication date: 01/01/2010

Automatic recognition of semantic relations constitutes an important part of information extraction. Many existing information extraction systems rely on syntactic information found in a sentence to accomplish this task. In this paper, we look into relation arguments and claim that some semantic relations can be described by constraints imposed on them. This information would provide more insight into the nature of semantic relations and could be further combined with the evidence found in a sentence to arrive at actual extractions.

Utrecht University Repository

International Migration, Integration and Social Cohesion online publications